Exploration server for computer science research in Lorraine

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Around Inverse Reinforcement Learning and Score-based Classification

Internal identifier: 000F64 (Main/Exploration); previous: 000F63; next: 000F65

Authors: Matthieu Geist [France]; Edouard Klein [France]; Bilal Piot [France]; Yann Guermeur [France]; Olivier Pietquin [France]

RBID : Hal:hal-00916936

Abstract

Inverse reinforcement learning (IRL) aims at estimating an unknown reward function optimized by some expert agent, from interactions between this expert and the system to be controlled. One of its major application fields is imitation learning, where the goal is to imitate the expert, possibly in situations not encountered before. A classic and simple way to handle this problem is to treat it as a classification problem, mapping states to actions. The potential issue with this approach is that classification does not naturally take into account the temporal structure of sequential decision making. Yet many classification algorithms consist of learning a score function, mapping state-action pairs to values, such that the value of the action chosen by the expert is higher than that of the others. The decision rule of the classifier maximizes the score over actions for a given state. This is curiously reminiscent of the state-action value function in reinforcement learning, and of the associated greedy policy. Based on this simple observation, we propose two IRL algorithms that incorporate the structure of the sequential decision making problem into a classifier in different ways. The first one, SCIRL (Structured Classification for IRL), starts from the observation that linearly parameterizing a reward function by some features imposes a linear parameterization of the Q-function by a so-called feature expectation. SCIRL simply uses (an estimate of) the expert feature expectation as the basis function of the score function. The second algorithm, CSI (Cascaded Supervised IRL), applies a reversed Bellman equation (expressing the reward as a function of the Q-function) to the score function output by any score-based classifier, which reduces to a simple (and generic) regression step. These two algorithms come with theoretical guarantees and perform competitively on toy problems.
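
As an illustration of the CSI idea described in the abstract, the following minimal Python sketch applies a reversed Bellman equation to a tabular score function and then performs a trivial regression step. It is not the authors' implementation: the function names, the tabular layout of `score`, the `(state, action, next_state)` transition tuples, the discount factor `gamma`, the use of the greedy (max) score at the next state, and the per-pair average used as the regression step are all assumptions made for this example.

```python
from collections import defaultdict


def reversed_bellman_targets(score, transitions, gamma):
    """Turn classifier scores into reward targets (illustrative sketch).

    score[s][a] is the score of action a in state s (hypothetical tabular layout).
    transitions is a list of (s, a, s_next) tuples sampled from the expert.
    Assumed relation: r(s, a) = q(s, a) - gamma * max_a' q(s_next, a').
    """
    targets = []
    for s, a, s_next in transitions:
        greedy_next = max(score[s_next])  # score of the classifier's decision at s_next
        targets.append(score[s][a] - gamma * greedy_next)
    return targets


def fit_tabular_reward(score, transitions, gamma):
    """Regression step, here reduced to averaging targets per (state, action) pair.

    With one-hot features this average is the least-squares solution; any other
    regressor could be substituted.
    """
    targets = reversed_bellman_targets(score, transitions, gamma)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (s, a, _), r in zip(transitions, targets):
        sums[(s, a)] += r
        counts[(s, a)] += 1
    return {key: sums[key] / counts[key] for key in sums}
```

Calling `fit_tabular_reward(score, transitions, gamma)` returns a dictionary that maps each observed `(state, action)` pair to an estimated reward. Pairs never visited by the expert receive no estimate in this sketch.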

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Around Inverse Reinforcement Learning and Score-based Classification</title>
<author>
<name sortKey="Geist, Matthieu" sort="Geist, Matthieu" uniqKey="Geist M" first="Matthieu" last="Geist">Matthieu Geist</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-389760" status="INCOMING">
<orgName>IMS - Equipe Information, Multimodalité et Signal</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-26305" type="direct"></relation>
<relation active="#struct-300812" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-26305" type="direct">
<org type="laboratory" xml:id="struct-26305" status="VALID">
<orgName>SUPELEC-Campus Metz</orgName>
<desc>
<address>
<addrLine>2 rue Edouard Belin 57070 Metz</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.metz.supelec.fr/metz/</ref>
</desc>
<listRelation>
<relation active="#struct-300812" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300812" type="indirect">
<org type="institution" xml:id="struct-300812" status="VALID">
<orgName>SUPELEC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author>
<name sortKey="Klein, Edouard" sort="Klein, Edouard" uniqKey="Klein E" first="Edouard" last="Klein">Edouard Klein</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-394500" status="INCOMING">
<orgName>IMS - Equipe Information, Multimodalité et Signal</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-24541" type="direct"></relation>
<relation active="#struct-242365" type="indirect"></relation>
<relation active="#struct-300289" type="indirect"></relation>
<relation active="#struct-300413" type="indirect"></relation>
<relation active="#struct-300812" type="indirect"></relation>
<relation active="#struct-301990" type="indirect"></relation>
<relation active="#struct-301991" type="indirect"></relation>
<relation active="#struct-411575" type="indirect"></relation>
<relation name="UMI2958" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-26305" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-24541" type="direct">
<org type="laboratory" xml:id="struct-24541" status="VALID">
<idno type="RNSR">200619366D</idno>
<orgName>Georgia Tech - CNRS [Metz]</orgName>
<orgName type="acronym">UMI2958</orgName>
<desc>
<address>
<addrLine>Metz Technopôle 2-3 rue Marconi 57070 METZ</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.umi2958.eu</ref>
</desc>
<listRelation>
<relation active="#struct-242365" type="direct"></relation>
<relation active="#struct-300289" type="direct"></relation>
<relation active="#struct-300413" type="direct"></relation>
<relation active="#struct-300812" type="direct"></relation>
<relation active="#struct-301990" type="direct"></relation>
<relation active="#struct-301991" type="direct"></relation>
<relation active="#struct-411575" type="direct"></relation>
<relation name="UMI2958" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-242365" type="indirect">
<org type="institution" xml:id="struct-242365" status="VALID">
<idno type="IdRef">026403188</idno>
<idno type="ISNI">0000 0001 2188 3779 </idno>
<orgName>Université de Franche-Comté</orgName>
<orgName type="acronym">UFC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-fcomte.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300289" type="indirect">
<org type="institution" xml:id="struct-300289" status="OLD">
<orgName>Université Paul Verlaine - Metz</orgName>
<orgName type="acronym">UPVM</orgName>
<date type="end">2011-12-31</date>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300413" type="indirect">
<org type="institution" xml:id="struct-300413" status="VALID">
<orgName>Ecole Nationale Supérieure des Arts et Metiers Metz</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300812" type="indirect">
<org type="institution" xml:id="struct-300812" status="VALID">
<orgName>SUPELEC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301990" type="indirect">
<org type="institution" xml:id="struct-301990" status="VALID">
<orgName>Georgia Institute of Technology [Atlanta]</orgName>
<desc>
<address>
<addrLine>North Avenue, Atlanta, GA 30332</addrLine>
<country key="US"></country>
</address>
<ref type="url">http://www.gatech.edu/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301991" type="indirect">
<org type="institution" xml:id="struct-301991" status="VALID">
<orgName>Georgia Tech Lorraine</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-411575" type="indirect">
<org type="institution" xml:id="struct-411575" status="VALID">
<orgName>CentraleSupélec</orgName>
<desc>
<address>
<addrLine>3, rue Joliot Curie,Plateau de Moulon,91192 GIF-SUR-YVETTE Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.centralesupelec.fr</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMI2958" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="ISNI">0000000122597504</idno>
<idno type="IdRef">02636817X</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-26305" type="direct">
<org type="laboratory" xml:id="struct-26305" status="VALID">
<orgName>SUPELEC-Campus Metz</orgName>
<desc>
<address>
<addrLine>2 rue Edouard Belin 57070 Metz</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.metz.supelec.fr/metz/</ref>
</desc>
<listRelation>
<relation active="#struct-300812" type="direct"></relation>
</listRelation>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city" wicri:auto="siege">Besançon</settlement>
<region type="region" nuts="2">Franche-Comté</region>
</placeName>
<orgName type="university">Université de Franche-Comté</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Bourgogne Franche-Comté</orgName>
<placeName>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université Paul Verlaine - Metz</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Lorraine</orgName>
</affiliation>
</author>
<author>
<name sortKey="Piot, Bilal" sort="Piot, Bilal" uniqKey="Piot B" first="Bilal" last="Piot">Bilal Piot</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-394500" status="INCOMING">
<orgName>IMS - Equipe Information, Multimodalité et Signal</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-24541" type="direct"></relation>
<relation active="#struct-242365" type="indirect"></relation>
<relation active="#struct-300289" type="indirect"></relation>
<relation active="#struct-300413" type="indirect"></relation>
<relation active="#struct-300812" type="indirect"></relation>
<relation active="#struct-301990" type="indirect"></relation>
<relation active="#struct-301991" type="indirect"></relation>
<relation active="#struct-411575" type="indirect"></relation>
<relation name="UMI2958" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-26305" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-24541" type="direct">
<org type="laboratory" xml:id="struct-24541" status="VALID">
<idno type="RNSR">200619366D</idno>
<orgName>Georgia Tech - CNRS [Metz]</orgName>
<orgName type="acronym">UMI2958</orgName>
<desc>
<address>
<addrLine>Metz Technopôle 2-3 rue Marconi 57070 METZ</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.umi2958.eu</ref>
</desc>
<listRelation>
<relation active="#struct-242365" type="direct"></relation>
<relation active="#struct-300289" type="direct"></relation>
<relation active="#struct-300413" type="direct"></relation>
<relation active="#struct-300812" type="direct"></relation>
<relation active="#struct-301990" type="direct"></relation>
<relation active="#struct-301991" type="direct"></relation>
<relation active="#struct-411575" type="direct"></relation>
<relation name="UMI2958" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-242365" type="indirect">
<org type="institution" xml:id="struct-242365" status="VALID">
<idno type="IdRef">026403188</idno>
<idno type="ISNI">0000 0001 2188 3779 </idno>
<orgName>Université de Franche-Comté</orgName>
<orgName type="acronym">UFC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-fcomte.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300289" type="indirect">
<org type="institution" xml:id="struct-300289" status="OLD">
<orgName>Université Paul Verlaine - Metz</orgName>
<orgName type="acronym">UPVM</orgName>
<date type="end">2011-12-31</date>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300413" type="indirect">
<org type="institution" xml:id="struct-300413" status="VALID">
<orgName>Ecole Nationale Supérieure des Arts et Metiers Metz</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300812" type="indirect">
<org type="institution" xml:id="struct-300812" status="VALID">
<orgName>SUPELEC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301990" type="indirect">
<org type="institution" xml:id="struct-301990" status="VALID">
<orgName>Georgia Institute of Technology [Atlanta]</orgName>
<desc>
<address>
<addrLine>North Avenue, Atlanta, GA 30332</addrLine>
<country key="US"></country>
</address>
<ref type="url">http://www.gatech.edu/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301991" type="indirect">
<org type="institution" xml:id="struct-301991" status="VALID">
<orgName>Georgia Tech Lorraine</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-411575" type="indirect">
<org type="institution" xml:id="struct-411575" status="VALID">
<orgName>CentraleSupélec</orgName>
<desc>
<address>
<addrLine>3, rue Joliot Curie,Plateau de Moulon,91192 GIF-SUR-YVETTE Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.centralesupelec.fr</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMI2958" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="ISNI">0000000122597504</idno>
<idno type="IdRef">02636817X</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-26305" type="direct">
<org type="laboratory" xml:id="struct-26305" status="VALID">
<orgName>SUPELEC-Campus Metz</orgName>
<desc>
<address>
<addrLine>2 rue Edouard Belin 57070 Metz</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.metz.supelec.fr/metz/</ref>
</desc>
<listRelation>
<relation active="#struct-300812" type="direct"></relation>
</listRelation>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city" wicri:auto="siege">Besançon</settlement>
<region type="region" nuts="2">Franche-Comté</region>
</placeName>
<orgName type="university">Université de Franche-Comté</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Bourgogne Franche-Comté</orgName>
<placeName>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université Paul Verlaine - Metz</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Lorraine</orgName>
</affiliation>
</author>
<author>
<name sortKey="Guermeur, Yann" sort="Guermeur, Yann" uniqKey="Guermeur Y" first="Yann" last="Guermeur">Yann Guermeur</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-206035" status="VALID">
<orgName>Machine Learning and Computational Biology</orgName>
<orgName type="acronym">ABC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr/la-recherche/equipes/abc</ref>
</desc>
<listRelation>
<relation active="#struct-423083" type="direct"></relation>
<relation active="#struct-206040" type="indirect"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-423083" type="direct">
<org type="department" xml:id="struct-423083" status="VALID">
<orgName>Department of Algorithms, Computation, Image and Geometry</orgName>
<orgName type="acronym">LORIA - ALGO</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr/la-recherche-en/departements/algorithmics</ref>
</desc>
<listRelation>
<relation active="#struct-206040" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-206040" type="indirect">
<org type="laboratory" xml:id="struct-206040" status="VALID">
<idno type="IdRef">067077927</idno>
<idno type="RNSR">198912571S</idno>
<idno type="IdUnivLorraine">[UL]RSI--</idno>
<orgName>Laboratoire Lorrain de Recherche en Informatique et ses Applications</orgName>
<orgName type="acronym">LORIA</orgName>
<date type="start">2012-01-01</date>
<desc>
<address>
<addrLine>Campus Scientifique BP 239 54506 Vandoeuvre-lès-Nancy Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr</ref>
</desc>
<listRelation>
<relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-413289" type="direct"></relation>
<relation name="UMR7503" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect">
<org type="institution" xml:id="struct-300009" status="VALID">
<orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc>
<address>
<addrLine>Domaine de Voluceau Rocquencourt - BP 105 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-413289" type="indirect">
<org type="institution" xml:id="struct-413289" status="VALID">
<idno type="IdRef">157040569</idno>
<idno type="IdUnivLorraine">[UL]100--</idno>
<orgName>Université de Lorraine</orgName>
<orgName type="acronym">UL</orgName>
<date type="start">2012-01-01</date>
<desc>
<address>
<addrLine>34 cours Léopold - CS 25233 - 54052 Nancy cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lorraine.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR7503" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="ISNI">0000000122597504</idno>
<idno type="IdRef">02636817X</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city">Nancy</settlement>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université de Lorraine</orgName>
</affiliation>
</author>
<author>
<name sortKey="Pietquin, Olivier" sort="Pietquin, Olivier" uniqKey="Pietquin O" first="Olivier" last="Pietquin">Olivier Pietquin</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-394500" status="INCOMING">
<orgName>IMS - Equipe Information, Multimodalité et Signal</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-24541" type="direct"></relation>
<relation active="#struct-242365" type="indirect"></relation>
<relation active="#struct-300289" type="indirect"></relation>
<relation active="#struct-300413" type="indirect"></relation>
<relation active="#struct-300812" type="indirect"></relation>
<relation active="#struct-301990" type="indirect"></relation>
<relation active="#struct-301991" type="indirect"></relation>
<relation active="#struct-411575" type="indirect"></relation>
<relation name="UMI2958" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-26305" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-24541" type="direct">
<org type="laboratory" xml:id="struct-24541" status="VALID">
<idno type="RNSR">200619366D</idno>
<orgName>Georgia Tech - CNRS [Metz]</orgName>
<orgName type="acronym">UMI2958</orgName>
<desc>
<address>
<addrLine>Metz Technopôle 2-3 rue Marconi 57070 METZ</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.umi2958.eu</ref>
</desc>
<listRelation>
<relation active="#struct-242365" type="direct"></relation>
<relation active="#struct-300289" type="direct"></relation>
<relation active="#struct-300413" type="direct"></relation>
<relation active="#struct-300812" type="direct"></relation>
<relation active="#struct-301990" type="direct"></relation>
<relation active="#struct-301991" type="direct"></relation>
<relation active="#struct-411575" type="direct"></relation>
<relation name="UMI2958" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-242365" type="indirect">
<org type="institution" xml:id="struct-242365" status="VALID">
<idno type="IdRef">026403188</idno>
<idno type="ISNI">0000 0001 2188 3779 </idno>
<orgName>Université de Franche-Comté</orgName>
<orgName type="acronym">UFC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-fcomte.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300289" type="indirect">
<org type="institution" xml:id="struct-300289" status="OLD">
<orgName>Université Paul Verlaine - Metz</orgName>
<orgName type="acronym">UPVM</orgName>
<date type="end">2011-12-31</date>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300413" type="indirect">
<org type="institution" xml:id="struct-300413" status="VALID">
<orgName>Ecole Nationale Supérieure des Arts et Metiers Metz</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300812" type="indirect">
<org type="institution" xml:id="struct-300812" status="VALID">
<orgName>SUPELEC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301990" type="indirect">
<org type="institution" xml:id="struct-301990" status="VALID">
<orgName>Georgia Institute of Technology [Atlanta]</orgName>
<desc>
<address>
<addrLine>North Avenue, Atlanta, GA 30332</addrLine>
<country key="US"></country>
</address>
<ref type="url">http://www.gatech.edu/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301991" type="indirect">
<org type="institution" xml:id="struct-301991" status="VALID">
<orgName>Georgia Tech Lorraine</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-411575" type="indirect">
<org type="institution" xml:id="struct-411575" status="VALID">
<orgName>CentraleSupélec</orgName>
<desc>
<address>
<addrLine>3, rue Joliot Curie,Plateau de Moulon,91192 GIF-SUR-YVETTE Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.centralesupelec.fr</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMI2958" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="ISNI">0000000122597504</idno>
<idno type="IdRef">02636817X</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-26305" type="direct">
<org type="laboratory" xml:id="struct-26305" status="VALID">
<orgName>SUPELEC-Campus Metz</orgName>
<desc>
<address>
<addrLine>2 rue Edouard Belin 57070 Metz</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.metz.supelec.fr/metz/</ref>
</desc>
<listRelation>
<relation active="#struct-300812" type="direct"></relation>
</listRelation>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city" wicri:auto="siege">Besançon</settlement>
<region type="region" nuts="2">Franche-Comté</region>
</placeName>
<orgName type="university">Université de Franche-Comté</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Bourgogne Franche-Comté</orgName>
<placeName>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université Paul Verlaine - Metz</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Lorraine</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:hal-00916936</idno>
<idno type="halId">hal-00916936</idno>
<idno type="halUri">https://hal-supelec.archives-ouvertes.fr/hal-00916936</idno>
<idno type="url">https://hal-supelec.archives-ouvertes.fr/hal-00916936</idno>
<date when="2013-10-25">2013-10-25</date>
<idno type="wicri:Area/Hal/Corpus">000F85</idno>
<idno type="wicri:Area/Hal/Curation">000F85</idno>
<idno type="wicri:Area/Hal/Checkpoint">000E66</idno>
<idno type="wicri:explorRef" wicri:stream="Hal" wicri:step="Checkpoint">000E66</idno>
<idno type="wicri:Area/Main/Merge">000F75</idno>
<idno type="wicri:Area/Main/Curation">000F64</idno>
<idno type="wicri:Area/Main/Exploration">000F64</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Around Inverse Reinforcement Learning and Score-based Classification</title>
<author>
<name sortKey="Geist, Matthieu" sort="Geist, Matthieu" uniqKey="Geist M" first="Matthieu" last="Geist">Matthieu Geist</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-389760" status="INCOMING">
<orgName>IMS - Equipe Information, Multimodalité et Signal</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-26305" type="direct"></relation>
<relation active="#struct-300812" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-26305" type="direct">
<org type="laboratory" xml:id="struct-26305" status="VALID">
<orgName>SUPELEC-Campus Metz</orgName>
<desc>
<address>
<addrLine>2 rue Edouard Belin 57070 Metz</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.metz.supelec.fr/metz/</ref>
</desc>
<listRelation>
<relation active="#struct-300812" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300812" type="indirect">
<org type="institution" xml:id="struct-300812" status="VALID">
<orgName>SUPELEC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
</affiliation>
</author>
<author>
<name sortKey="Klein, Edouard" sort="Klein, Edouard" uniqKey="Klein E" first="Edouard" last="Klein">Edouard Klein</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-394500" status="INCOMING">
<orgName>IMS - Equipe Information, Multimodalité et Signal</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-24541" type="direct"></relation>
<relation active="#struct-242365" type="indirect"></relation>
<relation active="#struct-300289" type="indirect"></relation>
<relation active="#struct-300413" type="indirect"></relation>
<relation active="#struct-300812" type="indirect"></relation>
<relation active="#struct-301990" type="indirect"></relation>
<relation active="#struct-301991" type="indirect"></relation>
<relation active="#struct-411575" type="indirect"></relation>
<relation name="UMI2958" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-26305" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-24541" type="direct">
<org type="laboratory" xml:id="struct-24541" status="VALID">
<idno type="RNSR">200619366D</idno>
<orgName>Georgia Tech - CNRS [Metz]</orgName>
<orgName type="acronym">UMI2958</orgName>
<desc>
<address>
<addrLine>Metz Technopôle 2-3 rue Marconi 57070 METZ</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.umi2958.eu</ref>
</desc>
<listRelation>
<relation active="#struct-242365" type="direct"></relation>
<relation active="#struct-300289" type="direct"></relation>
<relation active="#struct-300413" type="direct"></relation>
<relation active="#struct-300812" type="direct"></relation>
<relation active="#struct-301990" type="direct"></relation>
<relation active="#struct-301991" type="direct"></relation>
<relation active="#struct-411575" type="direct"></relation>
<relation name="UMI2958" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-242365" type="indirect">
<org type="institution" xml:id="struct-242365" status="VALID">
<idno type="IdRef">026403188</idno>
<idno type="ISNI">0000 0001 2188 3779 </idno>
<orgName>Université de Franche-Comté</orgName>
<orgName type="acronym">UFC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-fcomte.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300289" type="indirect">
<org type="institution" xml:id="struct-300289" status="OLD">
<orgName>Université Paul Verlaine - Metz</orgName>
<orgName type="acronym">UPVM</orgName>
<date type="end">2011-12-31</date>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300413" type="indirect">
<org type="institution" xml:id="struct-300413" status="VALID">
<orgName>Ecole Nationale Supérieure des Arts et Metiers Metz</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300812" type="indirect">
<org type="institution" xml:id="struct-300812" status="VALID">
<orgName>SUPELEC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301990" type="indirect">
<org type="institution" xml:id="struct-301990" status="VALID">
<orgName>Georgia Institute of Technology [Atlanta]</orgName>
<desc>
<address>
<addrLine>North Avenue, Atlanta, GA 30332</addrLine>
<country key="US"></country>
</address>
<ref type="url">http://www.gatech.edu/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301991" type="indirect">
<org type="institution" xml:id="struct-301991" status="VALID">
<orgName>Georgia Tech Lorraine</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-411575" type="indirect">
<org type="institution" xml:id="struct-411575" status="VALID">
<orgName>CentraleSupélec</orgName>
<desc>
<address>
<addrLine>3, rue Joliot Curie,Plateau de Moulon,91192 GIF-SUR-YVETTE Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.centralesupelec.fr</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMI2958" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="ISNI">0000000122597504</idno>
<idno type="IdRef">02636817X</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-26305" type="direct">
<org type="laboratory" xml:id="struct-26305" status="VALID">
<orgName>SUPELEC-Campus Metz</orgName>
<desc>
<address>
<addrLine>2 rue Edouard Belin 57070 Metz</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.metz.supelec.fr/metz/</ref>
</desc>
<listRelation>
<relation active="#struct-300812" type="direct"></relation>
</listRelation>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city" wicri:auto="siege">Besançon</settlement>
<region type="region" nuts="2">Franche-Comté</region>
</placeName>
<orgName type="university">Université de Franche-Comté</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Bourgogne Franche-Comté</orgName>
<placeName>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université Paul Verlaine - Metz</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Lorraine</orgName>
</affiliation>
</author>
<author>
<name sortKey="Piot, Bilal" sort="Piot, Bilal" uniqKey="Piot B" first="Bilal" last="Piot">Bilal Piot</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-394500" status="INCOMING">
<orgName>IMS - Equipe Information, Multimodalité et Signal</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-24541" type="direct"></relation>
<relation active="#struct-242365" type="indirect"></relation>
<relation active="#struct-300289" type="indirect"></relation>
<relation active="#struct-300413" type="indirect"></relation>
<relation active="#struct-300812" type="indirect"></relation>
<relation active="#struct-301990" type="indirect"></relation>
<relation active="#struct-301991" type="indirect"></relation>
<relation active="#struct-411575" type="indirect"></relation>
<relation name="UMI2958" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-26305" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-24541" type="direct">
<org type="laboratory" xml:id="struct-24541" status="VALID">
<idno type="RNSR">200619366D</idno>
<orgName>Georgia Tech - CNRS [Metz]</orgName>
<orgName type="acronym">UMI2958</orgName>
<desc>
<address>
<addrLine>Metz Technopôle 2-3 rue Marconi 57070 METZ</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.umi2958.eu</ref>
</desc>
<listRelation>
<relation active="#struct-242365" type="direct"></relation>
<relation active="#struct-300289" type="direct"></relation>
<relation active="#struct-300413" type="direct"></relation>
<relation active="#struct-300812" type="direct"></relation>
<relation active="#struct-301990" type="direct"></relation>
<relation active="#struct-301991" type="direct"></relation>
<relation active="#struct-411575" type="direct"></relation>
<relation name="UMI2958" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-242365" type="indirect">
<org type="institution" xml:id="struct-242365" status="VALID">
<idno type="IdRef">026403188</idno>
<idno type="ISNI">0000 0001 2188 3779</idno>
<orgName>Université de Franche-Comté</orgName>
<orgName type="acronym">UFC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-fcomte.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300289" type="indirect">
<org type="institution" xml:id="struct-300289" status="OLD">
<orgName>Université Paul Verlaine - Metz</orgName>
<orgName type="acronym">UPVM</orgName>
<date type="end">2011-12-31</date>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300413" type="indirect">
<org type="institution" xml:id="struct-300413" status="VALID">
<orgName>Ecole Nationale Supérieure des Arts et Metiers Metz</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300812" type="indirect">
<org type="institution" xml:id="struct-300812" status="VALID">
<orgName>SUPELEC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301990" type="indirect">
<org type="institution" xml:id="struct-301990" status="VALID">
<orgName>Georgia Institute of Technology [Atlanta]</orgName>
<desc>
<address>
<addrLine>North Avenue, Atlanta, GA 30332</addrLine>
<country key="US"></country>
</address>
<ref type="url">http://www.gatech.edu/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301991" type="indirect">
<org type="institution" xml:id="struct-301991" status="VALID">
<orgName>Georgia Tech Lorraine</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-411575" type="indirect">
<org type="institution" xml:id="struct-411575" status="VALID">
<orgName>CentraleSupélec</orgName>
<desc>
<address>
<addrLine>3, rue Joliot Curie, Plateau de Moulon, 91192 GIF-SUR-YVETTE Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.centralesupelec.fr</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMI2958" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="ISNI">0000000122597504</idno>
<idno type="IdRef">02636817X</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-26305" type="direct">
<org type="laboratory" xml:id="struct-26305" status="VALID">
<orgName>SUPELEC-Campus Metz</orgName>
<desc>
<address>
<addrLine>2 rue Edouard Belin 57070 Metz</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.metz.supelec.fr/metz/</ref>
</desc>
<listRelation>
<relation active="#struct-300812" type="direct"></relation>
</listRelation>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city" wicri:auto="siege">Besançon</settlement>
<region type="region" nuts="2">Franche-Comté</region>
</placeName>
<orgName type="university">Université de Franche-Comté</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Bourgogne Franche-Comté</orgName>
<placeName>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université Paul Verlaine - Metz</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Lorraine</orgName>
</affiliation>
</author>
<author>
<name sortKey="Guermeur, Yann" sort="Guermeur, Yann" uniqKey="Guermeur Y" first="Yann" last="Guermeur">Yann Guermeur</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-206035" status="VALID">
<orgName>Machine Learning and Computational Biology</orgName>
<orgName type="acronym">ABC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr/la-recherche/equipes/abc</ref>
</desc>
<listRelation>
<relation active="#struct-423083" type="direct"></relation>
<relation active="#struct-206040" type="indirect"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-423083" type="direct">
<org type="department" xml:id="struct-423083" status="VALID">
<orgName>Department of Algorithms, Computation, Image and Geometry</orgName>
<orgName type="acronym">LORIA - ALGO</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr/la-recherche-en/departements/algorithmics</ref>
</desc>
<listRelation>
<relation active="#struct-206040" type="direct"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-413289" type="indirect"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-206040" type="indirect">
<org type="laboratory" xml:id="struct-206040" status="VALID">
<idno type="IdRef">067077927</idno>
<idno type="RNSR">198912571S</idno>
<idno type="IdUnivLorraine">[UL]RSI--</idno>
<orgName>Laboratoire Lorrain de Recherche en Informatique et ses Applications</orgName>
<orgName type="acronym">LORIA</orgName>
<date type="start">2012-01-01</date>
<desc>
<address>
<addrLine>Campus Scientifique BP 239 54506 Vandoeuvre-lès-Nancy Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr</ref>
</desc>
<listRelation>
<relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-413289" type="direct"></relation>
<relation name="UMR7503" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect">
<org type="institution" xml:id="struct-300009" status="VALID">
<orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc>
<address>
<addrLine>Domaine de Voluceau, Rocquencourt - BP 105, 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-413289" type="indirect">
<org type="institution" xml:id="struct-413289" status="VALID">
<idno type="IdRef">157040569</idno>
<idno type="IdUnivLorraine">[UL]100--</idno>
<orgName>Université de Lorraine</orgName>
<orgName type="acronym">UL</orgName>
<date type="start">2012-01-01</date>
<desc>
<address>
<addrLine>34 cours Léopold - CS 25233 - 54052 Nancy cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-lorraine.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMR7503" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="ISNI">0000000122597504</idno>
<idno type="IdRef">02636817X</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city">Nancy</settlement>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université de Lorraine</orgName>
</affiliation>
</author>
<author>
<name sortKey="Pietquin, Olivier" sort="Pietquin, Olivier" uniqKey="Pietquin O" first="Olivier" last="Pietquin">Olivier Pietquin</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-394500" status="INCOMING">
<orgName>IMS - Equipe Information, Multimodalité et Signal</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-24541" type="direct"></relation>
<relation active="#struct-242365" type="indirect"></relation>
<relation active="#struct-300289" type="indirect"></relation>
<relation active="#struct-300413" type="indirect"></relation>
<relation active="#struct-300812" type="indirect"></relation>
<relation active="#struct-301990" type="indirect"></relation>
<relation active="#struct-301991" type="indirect"></relation>
<relation active="#struct-411575" type="indirect"></relation>
<relation name="UMI2958" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-26305" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-24541" type="direct">
<org type="laboratory" xml:id="struct-24541" status="VALID">
<idno type="RNSR">200619366D</idno>
<orgName>Georgia Tech - CNRS [Metz]</orgName>
<orgName type="acronym">UMI2958</orgName>
<desc>
<address>
<addrLine>Metz Technopôle 2-3 rue Marconi 57070 METZ</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.umi2958.eu</ref>
</desc>
<listRelation>
<relation active="#struct-242365" type="direct"></relation>
<relation active="#struct-300289" type="direct"></relation>
<relation active="#struct-300413" type="direct"></relation>
<relation active="#struct-300812" type="direct"></relation>
<relation active="#struct-301990" type="direct"></relation>
<relation active="#struct-301991" type="direct"></relation>
<relation active="#struct-411575" type="direct"></relation>
<relation name="UMI2958" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-242365" type="indirect">
<org type="institution" xml:id="struct-242365" status="VALID">
<idno type="IdRef">026403188</idno>
<idno type="ISNI">0000 0001 2188 3779</idno>
<orgName>Université de Franche-Comté</orgName>
<orgName type="acronym">UFC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-fcomte.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300289" type="indirect">
<org type="institution" xml:id="struct-300289" status="OLD">
<orgName>Université Paul Verlaine - Metz</orgName>
<orgName type="acronym">UPVM</orgName>
<date type="end">2011-12-31</date>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300413" type="indirect">
<org type="institution" xml:id="struct-300413" status="VALID">
<orgName>Ecole Nationale Supérieure des Arts et Metiers Metz</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300812" type="indirect">
<org type="institution" xml:id="struct-300812" status="VALID">
<orgName>SUPELEC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301990" type="indirect">
<org type="institution" xml:id="struct-301990" status="VALID">
<orgName>Georgia Institute of Technology [Atlanta]</orgName>
<desc>
<address>
<addrLine>North Avenue, Atlanta, GA 30332</addrLine>
<country key="US"></country>
</address>
<ref type="url">http://www.gatech.edu/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301991" type="indirect">
<org type="institution" xml:id="struct-301991" status="VALID">
<orgName>Georgia Tech Lorraine</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-411575" type="indirect">
<org type="institution" xml:id="struct-411575" status="VALID">
<orgName>CentraleSupélec</orgName>
<desc>
<address>
<addrLine>3, rue Joliot Curie, Plateau de Moulon, 91192 GIF-SUR-YVETTE Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.centralesupelec.fr</ref>
</desc>
</org>
</tutelle>
<tutelle name="UMI2958" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="ISNI">0000000122597504</idno>
<idno type="IdRef">02636817X</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-26305" type="direct">
<org type="laboratory" xml:id="struct-26305" status="VALID">
<orgName>SUPELEC-Campus Metz</orgName>
<desc>
<address>
<addrLine>2 rue Edouard Belin 57070 Metz</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.metz.supelec.fr/metz/</ref>
</desc>
<listRelation>
<relation active="#struct-300812" type="direct"></relation>
</listRelation>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city" wicri:auto="siege">Besançon</settlement>
<region type="region" nuts="2">Franche-Comté</region>
</placeName>
<orgName type="university">Université de Franche-Comté</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Bourgogne Franche-Comté</orgName>
<placeName>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université Paul Verlaine - Metz</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Lorraine</orgName>
</affiliation>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Inverse reinforcement learning (IRL) aims at estimating an unknown reward function optimized by some expert agent from interactions between this expert and the system to be controlled. One of its major application fields is imitation learning, where the goal is to imitate the expert, possibly in situations not encountered before. A classic and simple way to handle this problem is to see it as a classification problem, mapping states to actions. The potential issue with this approach is that classification does not naturally take into account the temporal structure of sequential decision making. Yet, many classification algorithms consist in learning a score function, mapping state-action pairs to values, such that the value of the action chosen by the expert is higher than that of the others. The decision rule of the classifier maximizes the score over actions for a given state. This is curiously reminiscent of the state-action value function in reinforcement learning, and of the associated greedy policy. Based on this simple observation, we propose two IRL algorithms that incorporate the structure of the sequential decision making problem into some classifier in different ways. The first one, SCIRL (Structured Classification for IRL), starts from the observation that linearly parameterizing a reward function by some features imposes a linear parameterization of the Q-function by a so-called feature expectation. SCIRL simply uses (an estimate of) the expert feature expectation as the basis function of the score function. The second algorithm, CSI (Cascaded Supervised IRL), applies a reversed Bellman equation (expressing the reward as a function of the Q-function) to the score function output by any score-based classifier, which reduces to a simple (and generic) regression step. These two algorithms come with theoretical guarantees and perform competitively on toy problems.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>France</li>
</country>
<region>
<li>Franche-Comté</li>
<li>Grand Est</li>
<li>Lorraine (région)</li>
</region>
<settlement>
<li>Besançon</li>
<li>Metz</li>
<li>Nancy</li>
</settlement>
<orgName>
<li>Université Paul Verlaine - Metz</li>
<li>Université de Bourgogne Franche-Comté</li>
<li>Université de Franche-Comté</li>
<li>Université de Lorraine</li>
</orgName>
</list>
<tree>
<country name="France">
<noRegion>
<name sortKey="Geist, Matthieu" sort="Geist, Matthieu" uniqKey="Geist M" first="Matthieu" last="Geist">Matthieu Geist</name>
</noRegion>
<name sortKey="Guermeur, Yann" sort="Guermeur, Yann" uniqKey="Guermeur Y" first="Yann" last="Guermeur">Yann Guermeur</name>
<name sortKey="Klein, Edouard" sort="Klein, Edouard" uniqKey="Klein E" first="Edouard" last="Klein">Edouard Klein</name>
<name sortKey="Pietquin, Olivier" sort="Pietquin, Olivier" uniqKey="Pietquin O" first="Olivier" last="Pietquin">Olivier Pietquin</name>
<name sortKey="Piot, Bilal" sort="Piot, Bilal" uniqKey="Piot B" first="Bilal" last="Piot">Bilal Piot</name>
</country>
</tree>
</affiliations>
</record>
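
The abstract of this record describes CSI as applying a reversed Bellman equation to a classifier's score function, followed by a regression step. The following is only a minimal sketch of that idea on a tabular problem, not the authors' implementation. The score table `q`, the sampled `transitions`, the discount `gamma`, and both function names are hypothetical and chosen for illustration. The regression step is replaced here by a per-cell mean, which is what least squares gives with one-hot state-action features.

```python
from collections import defaultdict


def reversed_bellman_targets(q, transitions, gamma):
    """Turn sampled (s, a, s_next) triples into reward targets.

    Each target is r_hat = q[s][a] - gamma * max_a' q[s_next][a'],
    i.e. the Bellman equation solved for the reward, with the greedy
    decision rule of the score function used in the next state.
    """
    return [
        (s, a, q[s][a] - gamma * max(q[s_next]))
        for s, a, s_next in transitions
    ]


def tabular_regression(targets):
    """Fit a reward table by averaging the targets of each (s, a) cell.

    Assumption: one-hot features, so least squares reduces to a mean.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for s, a, r_hat in targets:
        sums[(s, a)] += r_hat
        counts[(s, a)] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical 2-state, 2-action score table q[s][a] (illustrative values).
q = [[1.0, 0.5], [0.0, 2.0]]
# Hypothetical sampled transitions (state, action, next state).
transitions = [(0, 0, 1), (0, 1, 0), (1, 1, 1), (0, 0, 1)]

reward = tabular_regression(reversed_bellman_targets(q, transitions, gamma=0.5))
print(reward)
```

State-action pairs that never appear in the samples get no reward estimate in this sketch. A regressor that generalizes across pairs would be needed to fill them in.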

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000F64 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000F64 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Hal:hal-00916936
   |texte=   Around Inverse Reinforcement Learning and Score-based Classification
}}

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022